


THE JOURNAL OF 
EDUCATIONAL PSYCHOLOGY 








Volume XXXVII October, 1946 Number 7 
RELIABILITY OF ITEM-ANALYSIS DATA 


FREDERICK B. DAVIS 


American Council on Education 


In a recent publication,¹ the writer presented a chart from 
which a difficulty index and a discrimination index for each one of 
any group of test items may be read simultaneously. The 
rapidity and convenience with which item-analysis data can be 
obtained by means of this chart are due in part to the fact 
that approximation procedures are employed in calculating the 
percentages used to enter it. Although these approximation pro- 
cedures are entirely reasonable, their use does prevent the calcula- 
tion of standard errors for the discrimination and difficulty 
indices except by empirical means. 

After printer’s copy had been prepared for the publication on 
item-analysis data that is referred to above, it became possible 
to calculate the correlation coefficient between discrimination 
indices obtained for a set of eighty-six test items administered 
to two different random samples of three hundred seventy men 
drawn from the same population. Likewise, the correlation 
coefficient between difficulty indices obtained for the same test 
items in the same samples was calculated. These two correla- 
tion coefficients indicate the extent to which, under specified 
circumstances, the indices derived from the Davis Item-Analysis 
Chart are consistent from sample to sample. 

The relative importance of the many factors that determine 
the selection of items for the final form of any particular test is a 
matter that must be left to the professional judgment of the test 





1 F. B. Davis, Item-Analysis Data: Their Computation, Interpretation, and 
Use in Test Construction. Cambridge: Harvard Graduate School of Educa- 
tion, 1946. 
constructor. However, it is obvious that very little importance 
could ever be attached to item-analysis data if they could not be 
depended upon to place the items in essentially the same rank 
order from sample to sample. Hence the desirability of obtain- 
ing the correlation coefficients between discrimination and 
difficulty indices for the same set of test items from different 
but comparable samples of the same size. 

In actual practice, it is unusual to find that every testee has had 
time to read every item in a test administered to obtain data for 
item-analysis purposes, but an effort should be made to approxi- 
mate this condition. Otherwise, the number of cases on which 
the item-analysis data for a given item near the end of the test 
are based may be so small that the discrimination and difficulty 
indices obtained for that item are sufficiently unreliable as to be of 
no practical use. The indices may also be misleading because the 
small sample on which they are based is not likely to be entirely 
representative. To avoid these difficulties, the discrimination 
and difficulty indices used for correlational purposes were based 
on items in a test administered in such a way that every testee 
read every item. From a population of over one thousand 
aviation cadets, the answer sheets for two random samples of 
three hundred seventy men each were drawn. The computa- 
tional procedures specified in Chapter V of the monograph that 
includes the Davis Item-Analysis Chart¹ were followed precisely, 
and the product-moment correlation coefficients between the two 
sets of indices were calculated. 
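
The product-moment coefficient referred to here can be sketched in a few lines. The following is only an illustration: the function name `pearson_r` and the five pairs of index values are hypothetical and are not the eighty-six indices of the study.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical discrimination indices for the same five items in two
# samples (illustrative values only, not data from the article).
sample_a = [12, 25, 31, 40, 18]
sample_b = [15, 22, 35, 37, 20]
r_ab = pearson_r(sample_a, sample_b)
print(round(r_ab, 2))
```

Each item contributes one pair of values, so N in the tables that follow is the number of items (eighty-six), not the number of testees.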

Following are the data for the discrimination indices from 
samples A and B: 


  N     M_A     M_B     σ_A     σ_B     r_AB 
  86    25.14   24.10   14.30   11.75   .58 


Data for the difficulty indices are as follows: 


  N     M_A     M_B     σ_A     σ_B     r_AB 
  86    48.42   48.05   22.33   22.08   .98 


The computational procedures recommended by the writer for 
use with the item-analysis chart include a correction for chance 





1 F. B. Davis, op. cit., pp. 30-38. 
success. With that correction omitted, the correlation of the 
resulting sets of discrimination indices becomes: 


  N     M_A     M_B     σ_A     σ_B     r_AB 
  86    18.73   18.03   10.14   7.85    .65 


The difference between the coefficients of .58 and .65 is not 
large enough to be considered either statistically or practically 
significant. 
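
A rough check of this statement can be sketched with Fisher's z transformation. The sketch treats the two coefficients as if they came from independent sets of eighty-six items, which is an assumption made only for illustration; both coefficients here rest on the same items and samples, so the result is approximate. The function name is hypothetical.

```python
import math


def fisher_z_difference(r1, r2, n1, n2):
    """Approximate z test for the difference between two correlations.

    Assumes the two coefficients are independent (an assumption of this
    sketch, not a property of the data in the article).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z2 - z1) / standard_error
    # Two-sided probability under the normal curve.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


z, p = fisher_z_difference(0.58, 0.65, 86, 86)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under these assumptions the ratio falls well short of the values usually required for significance, which agrees with the conclusion stated above.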

Internal-consistency discrimination indices are most commonly 
used in test construction to select from among items that have 
other acceptable properties those which are most likely to dis- 
criminate between individuals who differ significantly with 
respect to the trait or traits that it is desired to measure. Ordi- 
narily, no predetermined minimum amount of item discriminating 
power is established; the test constructor simply arranges the 
items in rank order with respect to discriminating power, starts 
from the top of the list with the most discriminating item, and 
selects as many items as he needs that possess the various other 
required characteristics, such as suitability of subject matter, 
level of difficulty, etc. It is apparent that if the discrimination 
indices he employs are based on samples of about four hundred 
testees, he will have arranged the items in an order only roughly 
similar to their true rank order of discriminating power. Conse- 
quently, he should not be too concerned about rejecting, on the 
basis of professional judgment, an item found to be among the 
most discriminating; neither should he worry very much about 
accepting an item judged to be excellent just because it has a 
fairly low discrimination index. 

This does not mean that the item discrimination indices should 
be taken lightly or that there is no need to compute them. It 
does mean that they constitute only one fallible guide to the 
selection of test items and should be used intelligently, not 
blindly, as just that. It is likely that samples larger than four 
hundred should be used for internal-consistency item-analysis 
purposes when the total score on the experimental form of the 
test is actually a rather satisfactory criterion, as is the case when 
the items are quite homogeneous in content (in a general- 
vocabulary test, for example). When a test of this sort is being 
constructed, the internal-consistency item discrimination indices 
might be so important a consideration in the selection of specific 
items that it would be worth while to obtain highly reliable 
discrimination indices. As the content of the items in a test 
becomes more heterogeneous, the internal-consistency discrimina- 
tion indices become a less important consideration in the selection 
of items, their reliability is not so crucial, and the size of the 
sample used for item-analysis purposes can be reduced. 

The difficulty indices yielded by the Davis Item-Analysis Chart 
when a sample of at least three hundred seventy cases is employed 
are exceptionally reliable, as indicated by the correlation coeffi- 
cient of .98 between indices computed from two different samples. 
This is fortunate because for practical purposes in test construc- 
tion the distribution of difficulty indices of the items in a test 
should be closely controlled in order to achieve maximum 
efficiency of measurement—much more closely controlled, in 
fact, than is commonly supposed by many test constructors. 
It has been shown that the shape of the distribution of item 
difficulty indices that is optimum for a given test is determined 
by the degree of intercorrelation of the items and by the special 
purpose for which the test is to be used. Only under very 
peculiar circumstances, for example, would it be desirable to 
construct a test in which every item would be of fifty per cent 
difficulty for the group in which it was to be used. 

Since the item discrimination indices and the item difficulty 
indices derived from the Davis Item-Analysis Chart should 
properly be used independently in the process of selecting items 
for the final form of a test, it is of importance to know the relation- 
ship between the two sets of indices. For this reason, the cor- 
relation between them in each of the two samples was calculated. 
For the indices corrected for chance, as recommended by the 
writer, the resulting data are as follows: 


             N     M_Diff.   M_Disc.   σ_Diff.   σ_Disc.   r 
Sample A     86    48.42     25.14     22.33     14.30     .11 
Sample B     86    48.05     24.10     22.08     11.75     .02 


Neither of these coefficients is significantly different from zero 
and both of them are very low. It seems likely that we can 
safely generalize by concluding that difficulty and discrimination 
indices obtained from the Davis Item-Analysis Chart by means 
of the recommended procedures are likely to be essentially 
uncorrelated. 
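
The statement that neither coefficient differs significantly from zero can be illustrated with a normal approximation based on Fisher's z transformation. This is a sketch under that approximation, with N taken as the eighty-six items; the function name `r_vs_zero` is hypothetical, and the article does not say which test the writer applied.

```python
import math


def r_vs_zero(r, n):
    """Approximate two-sided test of a correlation against zero.

    Uses Fisher's z transformation with standard error 1/sqrt(n - 3).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Coefficients reported for Samples A and B, each based on 86 items.
results = {r: r_vs_zero(r, 86) for r in (0.11, 0.02)}
for r, (z, p) in results.items():
    print(f"r = {r:.2f}: z = {z:.2f}, two-sided p = {p:.2f}")
```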

To determine the relationship between discrimination and 
difficulty indices when the correction for chance is not employed, 
the correlation coefficient between uncorrected difficulty and 
discrimination indices was computed, using data for Sample A 
only. 


  N     M_Diff.   M_Disc.   σ_Diff.   σ_Disc.   r 
  86    56.16     18.73     14.93     10.14     .41 


The coefficient of .41 is considerably higher than that of .11 for 
the corrected indices in Sample A. Apparently, one previously 
unclaimed advantage of making a correction for chance in the 
computation of item-analysis data is that it greatly reduces the 
correlation between difficulty and discrimination indices and thus 
enhances their efficiency and their usefulness in test construction. 
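
The exact correction used with the chart is given in the writer's monograph and is not reproduced in this article. As a hedged illustration only, the conventional correction for chance success, rights minus wrongs divided by one less than the number of choices, may be sketched as follows. The function name, the flooring of negative values at zero, and the example figures are assumptions of this sketch.

```python
def corrected_percent_correct(rights, wrongs, n_choices, n_testees):
    """Percentage succeeding on an item after the conventional chance correction.

    Applies R - W / (k - 1), where k is the number of choices per item.
    Negative corrected counts are floored at zero (an assumption of this
    sketch, not a rule taken from the article).
    """
    if n_choices < 2:
        raise ValueError("an item needs at least two choices")
    if n_testees <= 0:
        raise ValueError("n_testees must be positive")
    corrected = rights - wrongs / (n_choices - 1)
    return 100.0 * max(corrected, 0.0) / n_testees


# Hypothetical item: 100 testees, 60 right and 40 wrong, five choices.
print(corrected_percent_correct(60, 40, 5, 100))
```

The correction lowers the apparent percentage of success most for items on which many testees mark wrong answers, which is one way it can alter the relation between difficulty and discrimination indices.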

One of the grave disadvantages of using statistics such as the 
product-moment r, the point biserial r, and the phi coefficient as 
indices of item discriminating power is that they vary systemat- 
ically with the level of item difficulty. The data presented above 
indicate that item discrimination indices obtained from the 
Davis Item-Analysis Chart are essentially free from this defect, as 
theoretically they should be. 


SUMMARY 


Data based on two comparable samples of three hundred 
seventy aviation cadets show that item difficulty indices obtained 
from the Davis Item-Analysis Chart are exceedingly reliable. 
Item discrimination indices obtained from the same chart are 
only moderately reliable when a sample of about four hundred 
testees is used. 

It is fortunate that the item difficulty indices are highly reliable 
since these are ordinarily more important and more useful to the 
test constructor than are the discrimination indices. When the 
latter are intended to be of primary importance in the selection of 
test items, they should be computed on the basis of samples of 
eight hundred or more testees. In general, the more homogene- 
ous the content of the test items, the more importance may 
properly be attached to discrimination indices in the selection of 
items. 

To minimize the correlation of difficulty and discrimination 
indices, it is desirable to make a correction for chance in comput- 
ing item-analysis data. Other logical considerations also favor 
correction for chance. 














TRANSFER OF TRAINING AGAIN 
A. R. MEAD 


University of Florida 


The writer has noted several examples of a revival of the older, 
unmodified doctrine of formal discipline. One of these is the 
article by Withers on “Latin, Law, and Medicine,” in the Educa- 
tional Forum for January, 1945. This article seems to be a 
revival of materials published long ago in such works as Taylor’s 
Classical Study, and, in more recent times, championed by Dean 
West of Princeton and others. The frequent reference to 
‘disciplines’ very likely connotes much of the same doctrine. 
Certain phases of it are involved in the statements of such men as 
President Hutchins and Professor Adler of the University of 
Chicago. The writer has heard many oral statements assuming 
this doctrine by college and university faculty members. It 
seems, therefore, that known facts about this doctrine are not 
being used. In the article referred to above, the evidence, such 
as it is, comes from one side of a many-sided problem. It seems 
that the matter should be revised in light of facts known about 
so-called transfer values. Such careful study of this problem as 
has been made does not deny the possibility of transfer under 
favorable conditions. Such study does indicate several possi- 
bilities, among which positive transfer is one. (See Thorndike’s 
Educational Psychology, Vol. II, pp. 351-357, 1913.) A large 
number of investigations have been made, some of which refer 
to the transfer values of study of Latin. In an article published in 
1935, Orata includes a comprehensive bibliography on the subject. 
Among these the following relate to interference, or negative 
effects, rather than transfer: Bergstrom, Jastrow and Cairnes, 
McMann and Washburn, Poffenberger. The following deal with 
transfer effects from the study of Latin: Carr, Coxe, Dallam, 
Gilliland, Hamblen, Harris, Haskell, Heald, Hill, Newcomb, Otis, 
Perkins, Robinson, Smith (D. R.), Thorndike and Ruger (four 
separate articles), Wilcox. There are also many related studies.¹ 

The writer does not profess to tell lawyers, physicians, nurses, 
etc., what they are, should be and do in their work. He does not 





1 Orata, Pedro. “Transfer of Training and Educational Pseudo-science,” 
Educational Administration and Supervision, Vol. XXI, pp. 241-264, April, 
1935. 
possess the data on which to base that advice. He does believe 
that he, along with others in similar professional work, is some- 
what informed in the professional research and literature of 
his own field. Advocates of the older doctrine seem negligent in 
that they ignore a large volume of data on the transfer of training. 


It does seem queer that after a long period of discussion of this 
problem, the opinions are not checked against the facts in the case. 
From some investigations made in the decades of 1890 and early 
1900’s, there came the following facts about transfer. 

First, the question at issue is, does an acquired ability (or 
function) remain a specialized ability, or does it ‘spread’ to and 
strengthen other abilities? For example, will a knowledge of 
Latin grammar improve one’s understanding of English gram- 
mar, and vice versa? If one becomes skilled in solving mathe- 
matical problems, printed in textbooks, will that improve his 
reasoning ability in finding practical solutions to a group of social 
problems? Will a knowledge of the structure of Latin language 
improve one’s understanding (and use) of the English? 

Second, the answers found to the problem of transfer of training 
when reduced to experimentation and research are not uniform. 
Instead, they differ. 

Third, early in the experimentation there were found some five 
or six possibilities of effects of training in one ability upon another 
ability. These may be pictured as in the accompanying figures. 


EXPLANATION OF DIAGRAMS 


In these illustrations, arrows pointing right indicate improve- 
ment; pointing left, indicate deterioration. In Figure I, improve- 
ment in A (use of English) spreads over and helps (strengthens) 
B. Incidentally, this seems to be what does usually occur in most 
cases of the two subjects listed, and with our own English speaking 
high school students. 

In Figure II, improvement in Latin spreads to B, but causes it to 
be less strong, to deteriorate. 

In Figure III, A weakens, while B increases. 

In Figure IV, both abilities deteriorate. 

In Figure V, A weakens and B improves. 

The above are theoretical possibilities. Let us take Withers’ 
article as an example. He has before him, then, this question: 
Does the study of Latin in high school help in Law and Medicine, 
or interfere with knowledge of Law and Medicine, or any one of 
the other possibilities? 

Smith, in a study of ‘translation-thinking’ to other abilities, 
found varied results, just as described above. Some children 
were helped; others hindered; in other cases, no effect was dis- 
covered. Woodring’s study of quality of Latin translation 
showed that the careless translation led to a more careless use of 
English—and that careless translation was the rule. 

There are frequent conflicts of meanings with language forms in 
different languages. The verb ‘met’ is spelled alike in English 
and French, but the meanings are different. 

After the long study of the problem, a quite general conclusion 
was reached that “the disciplinary value of a subject is not a 
sound basis for requiring people to study it.” In other words, each 
subject must stand or fall by its own values. Furthermore, 
extended examination of relationship of work in high school and 
the later work in college or university, shows that quality of work, 
irrespective of subject, is one of the best indicators of later ability. 

The writer recommends that any person making an essay into 
this field prepare a digest of the many experiments and analyze 
them; then supply his legal and medical friends this information. 
What would happen? It would be interesting to ascertain how 
many would try to adjust their opinion in light of these facts, and 
how many would continue with their present attitude. In order 
that neither the law, nor medicine, nor author Withers will be at a 
loss, the writer lists a sampling of the literature on the problem— 
and it does not point to the single conclusion that they so fre- 
quently assume. 

A final question is this: How long will it take for the facts 
known about transfer of training to be used, and adjustments 
made accordingly? One hundred years? Or never? 
A STUDY COMPARING ART ABILITIES AND 
GENERAL INTELLIGENCE OF 
COLLEGE STUDENTS 


EDNA A. BOTTORF 
State Teachers College, Lock Haven, Pennsylvania 


The purpose of this study was to find out what relation appears 
to exist between the general intelligence of college students and 
their abilities in art. 

General intelligence may be inferred from IQ’s as found through 
standard intelligence tests and from attainments in the various 
school subjects as shown by teachers’ grades. Art abilities may 
be evaluated through the teachers’ grades assigned to various art 
problems requiring differing abilities, where the teachers’ grades 
were secured in as objective a manner as possible. 

In this study, therefore, the art abilities of college students (as 
represented by grades they received in art) were compared with 
their respective IQ’s (as secured by a standard intelligence test 
they took) as well as with grades secured by these students in the 
various college subjects. 

Further than this, there was an effort to see if those students 
showing definite art ability—as represented by their art scores— 
had similar interests or excelled in the same subjects other than 
art. 


PREVIOUS RELATED STUDIES 


Numerous studies have been conducted with reference to gen- 
eral intelligence and to art ability of some sort. Most of these 
studies indicate that definite ability in art is accompanied by good 
general intelligence, but that good general intelligence is not nec- 
essarily accompanied by marked ability in art. Some of the 
authors claim that lack of good general intelligence does not 
always mean a lack of ability in art. 

Art ability has been determined by many methods. Some 
dealt with exceptional talent. Among the earliest of these were 
studies of Kik and Kerschensteiner, which Manuel claims 
were largely non-experimental. It is interesting to note that 
these early studies both considered good intellectual endowment 
an accompaniment of great talent, but not great talent neces- 
sarily an accompaniment of good intellectual endowment. 
Albien, on the other hand, disagreed since some of those he 
studied, although well-endowed in drawing ability, were not of 
high intelligence. Manuel agreed with Kik and Kerschen- 
steiner. More recent studies concerned with exceptional talent 
are those by Hollingsworth and Terman. Hollingsworth and 
Manuel thought exceptional ability in art might be found at times 
in individuals of low general intelligence, but Terman doubts the 
type of ability being measured here. He suggests it is a copying 
ability. He says, “Without superior general intelligence, special 
ability in music and art inevitably falls short of really great 
achievement.” Davis claims: “In drawing, there seems to be a 
slight tendency to rate gifted pupils higher than normal children, 
but this may be based largely upon their intellectual appreciation 
of the art . . . rather than upon any superior innate ability 
of graphic representation.” Among all these studies, art ability 
was commonly accepted as drawing or ‘representative’ ability, 
and in most cases its presence was found by teachers’ ratings or 
recommendations. 

Druley found fourth-, fifth-, and sixth-grade children who were 
gifted in art to be superior to an unselected group in their ability 
to judge art work, in intelligence, and in other factors. Winslow 
found a similar situation with his ninth-grade gifted pupils, selected 
on a basis of drawings rated with the Kline and Carey Measuring 
Scale and the Klar Scoring device. 

Other studies have been conducted that were concerned with 
individuals of low rather than of high general intelligence. 
Goodenough devised a scale to measure intelligence by means of 
a drawing-of-a-man test. This test has been used in many cases, 
frequently with subnormal children, and with conclusive results. 
Examples are the studies by Barrien and Yepsen. 

Several studies were concerned with art ability as determined in 
various ways among a normal group of children. Beal, Mohnike, 
and Dietsch compared drawing ability with intelligence 
scores. Some found that art ability was closely correlated with 
general intelligence up to the age of adolescence when the ability 
seemed to take on the character of a special ability and to have 
low but positive correlation with intelligence. The correlations 
ranged from slightly above zero up to .3 or .4, rarely exceeding 
.5. Goodenough found this true in comparing ability in the 
drawings of elementary-school children with both school grades 
and intelligence. Tiebout and Meier found correlations of .53 
in the first three grades with a drop to -.07 in the eighth grade 
between creative drawing ability and intelligence. The average 
IQ for the artistically superior on the high-school level was found 
to be between 107 and 109. Among adult successful artists they 
found superior intelligence exhibited, although an individual’s rank 
as an artist and his rank in intelligence had no definite correlation. 
They conclude: “Artistic ability is a special ability in the sense of 
being only somewhat related to general intelligence as measured 
by established tests. This applies primarily to the normal group. 
In the case of the selected group there is a tendency for a higher 
than average degree of intelligence to be present with artistic 
superiority.” 

Peck found a similar drop in correlation of drawings and intel- 
ligence at the mental age of nine years. Bird found a range of 
correlation from .14 to .51 between drawings of children and their 
intelligence scores secured from Goodenough’s drawing-of-a-man 
test. Lewerenz, on a battery of tests measuring seven skills and 
abilities in art, found very low correlations (.009 to .295) with 
intelligence scores when testing children from the third grade 
through the senior high school. He concludes that it is probably 
true that anyone who succeeds exceptionally well in art will also 
rank rather high on an intelligence test; however, a high intelli- 
gence test score does not necessarily bring a corresponding ability 
in art. Both Hoag and McGeoch compared ability in creative 
imagination with intelligence and found low correlations (-.004 
to .265, McGeoch). 

Several investigations have been conducted using scores made 
on art appreciation tests and intelligence scores. Eurich and 
Carroll report correlations of .10 and .26 between these criteria 
with college students. Garrels also found a positive correlation 
between art appreciation and general intelligence existing among 
sixth-grade children. 

Many experiments have been carried out investigating the 
relationship existing between ability in art and ability in other 
school subjects as shown by grades received in these subjects. 
Again ability in art was judged in different ways. Speaking of his 
gifted individuals, Terman suggests that they do work of supe- 
rior quality in those subjects which require abstract thought but 
they do only slightly better than average in those subjects which 
depend primarily upon manual dexterity or special talent; how- 
ever, even in those subjects they excel the normal group to some 
extent. His subjects were all below the college level. Using the 
Knauber Art Ability Test to determine the art ability of a junior- 
high-school group, Gunn reported a close relationship between 
this criterion and ability in general school subjects. He com- 
puted no coefficients of correlation. Winslow and Druley 
both report their gifted children as superior in other school sub- 
jects as well. 

Peck correlated the scores of his subjects on a drawing test 
with those they secured on a reading test and found r’s of .47 and 
.55. Carroll found a low correlation of .24 between appreciation 
of literature and appreciation of art among college students, which 
was larger than the correlation between appreciation of music and 
appreciation of art as shown by the same group. 

Ayer evaluated the correlations between rank in school draw- 
ing and rank in other school subjects using teachers’ grades of one 
hundred forty-one normal-school pupils and found an r of .49 with 
mathematics, .68 with English, .68 with music, .73 with education, 
.80 with history, .80 with science, and .66 with all. He stated that 
the drawing grades were computed from a number of separate 
factors which he listed as: (a) ability in representative drawing; 
(b) ability in designing; (c) ability in artistic discrimination; (d) 
ability with color, washes, shading, etc.; (e) attendance; (f) disci- 
pline; and (g) vocational interests. These correlations seem 
high, especially when he finds practically no correlation between a 
drawing of a turkey feather by high-school pupils and science, 
English, and mathematics grades. He explains this drop by 
attributing it to the isolation in this instance of the drawing 
ability which he does not believe to be correlated with school sub- 
ject grades. 

Hollingsworth reports several studies in which art grades were 
correlated with other subject grades. Among these was one by 
Weglein in which correlations between drawing and other sub- 
jects in high school were found to be .37 and .15 with English, .13 
with history, and .09 with algebra. 

A few investigations have been conducted studying the emo- 
tional stability of those superior in art ability. Eurich and 
Carroll, and Druley, found their superior art pupils also superior in 
emotional stability. 

One other study might be mentioned. In an investigation of 
the San Francisco public schools, the junior-high-school pupils 
electing art courses were found to be similar in mental abilities to 
the general level in each group concerned. 

Only five of these studies dealt with the college level. Most of 
them were based on art ability as determined by scores on art 
tests, on scores from a single drawing, or on a teacher’s recom- 
mendation or rating. If a teacher’s rating or art grades were 
used, these were secured in the usual manner, covered many 
separate art abilities, and included such factors as were mentioned 
by Ayer (attendance, discipline, vocational interests, test results, 
etc.). A rating or score secured from one drawing does not seem 
fair, particularly when even an accomplished individual varies 
considerably in the quality of his performances. Scores secured 
from a single test are likewise of doubtful validity where the 
test is checking creative abilities and is necessarily of brief com- 
pass and would show a similar handicap that the single drawing 
exhibited. Art problems require time for a full completion and 
tests must necessarily be covered in a fairly short period. Also 
ideas may not come readily, and they can not then be given due 
consideration. 

It appeared necessary in this study to find a method of securing 
an art score that seemed a fairer estimate of a person’s ability and 
that was based on a more representative sampling of the ability. 

Accordingly it was decided* to use as each art ability score the 
average score from several problems representing that ability— 
the problems themselves were varied and were scored in an objec- 
tive manner by several individuals. 


SECURING THE DATA 


Various art problems seem to call for various abilities. Con- 
struction Problems, Creative Designs, Perspective Drawing, Art 
Appreciation and Art History were the phases of art chosen for 
this study because of the variety of abilities needed in each. 

In Construction Problems, the chief requirements seemed to be 
ability to follow directions and manipulative skill, rather than 
thinking or making decisions. Creative Designs seemed to 
require imagination and originality, thinking, and a fine sense of 





* A preliminary study was conducted to try out several procedures and 
methods of attacking the problem. 
discrimination. In Perspective Drawing, facts discovered in one 
situation had to be applied to an entirely new, although similar 
situation, and appeared to require insight, thinking, and a making 
of choices. Art Appreciation depended upon a power of analysis, 
critical judgment, and a fine sense of discrimination and feeling. 
Art History was almost entirely a recognition of facts and forms 
previously studied. The criteria used in judging each phase 
covered these fundamental abilities and their manifestation in the 
finished results. 

In the Construction Problems, objects of a mechanical type 
were used; for instance, construction of a folder, letter folio, 
memorandum pad, notebook cover, simple bookbinding and 
weaving problems, and the cutting of letters from squared paper 
or squares of paper with a pattern to observe while working. 
Measurements were dictated and instructions were given as the 
work proceeded. Mechanical accuracy (represented by correct 
measurement and by fitted and sharp edges and corners) and 
neatness (such as lack of smudges, roughness, and paste marks) 
were emphasized and used as the criteria in judging each example. 

In this and in the other phases of art considered, the final scores 
used were secured by averaging the initial scores made on the 
various problems used in each phase. The initial scores were 
secured usually through a committee or group decision of six to 
nine people who were studying or teaching art. Standards were 
set up and the problems arranged in twelve groups.* The first 
group contained those examples—in the case of the Construction 
Problems—which possessed mechanical accuracy and showed per- 
fect measurements, fitted and sharp edges, no air bubbles in 
pasted sections, and no paste smudges. The examples showing 
the most inaccuracies—greatest variations in measurements, most 
poorly cut (roughest) and least well-fitted edges and pasted sur- 
faces, as well as paste marks—were placed in the lowest group. 
The examples were arranged progressively from the lowest to the 





*In order to simplify the procedure an attempt was made to divide 
examples into the usual five groups only. However, it was found difficult 
to place examples in so few groups and much easier to use the twelve-group 
plan. The averaging of scores was found to give approximately the same 
results in either case, so the twelve-group plan was used. The Art Appreci- 
ation and Art History scores were each secured from tests which naturally 
were scored in a different manner, as explained later. 
highest by the amount of variation in accuracy according to these
criteria as shown in Table I. 


TABLE I.—CRITERIA FOR SCORING PROBLEMS

                                                   Points or
                                                   Numerical
Standards (General)                         Group    Value    Grade
Special Ability or Few or No Faults            1       11       A
                                               2       10       A-
Better Than Average Ability or Results,        3        9       B+
  but not of Group 1 or 2 Quality              4        8       B
                                               5        7       B-
Average Results                                6        6       C+
                                               7        5       C
                                               8        4       C-
Inferior Results                               9        3       D+
                                              10        2       D
                                              11        1       D-
Little or No Value. Failure Work              12        0       E


Occasionally, in some problems no example seemed to be worth 
an A, or none appeared to lack at least a D value. At times, too, 
there seemed to be a definite gap between groups—for instance, 
between unusual work of an A grade and work of a better than 
average or B grade. In such circumstances there were fewer than
twelve groups, with gaps appearing in the grading. For instance,
one problem might have ten groups, no A’s or B—’s, but A—’s, 
B+’s, B’s, C+’s, C’s, C—’s, D+’s, D’s, D—’s, and E’s. The 
final scores were secured by averaging the points secured by each 
individual in the various problems of each phase of art. 
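The averaging step just described can be illustrated with a minimal sketch. The point values come from Table I; the function name `phase_score` and the pupil's grades are hypothetical and serve only as an illustration.

```python
# Point values taken from Table I: group 1 (A) = 11 points ... group 12 (E) = 0.
GRADE_POINTS = {
    "A": 11, "A-": 10, "B+": 9, "B": 8, "B-": 7, "C+": 6,
    "C": 5, "C-": 4, "D+": 3, "D": 2, "D-": 1, "E": 0,
}


def phase_score(grades):
    """Average the Table I point values of one pupil's problems in one phase."""
    if not grades:
        raise ValueError("at least one graded problem is required")
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


# Hypothetical pupil: four Construction problems graded B+, B, C+ and A-.
hypothetical_grades = ["B+", "B", "C+", "A-"]
print(phase_score(hypothetical_grades))  # (9 + 8 + 6 + 10) / 4 = 8.25
```

A final score of 8.25 would fall between the B and B+ point values of Table I.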

In the Creative Design problems, a study of the principles 
underlying good design preceded actual work. The first designs 
were done in black and white with India ink and lettering pens; 
later designs were worked out directly in water color on paper. 
Abstract designs were emphasized. The first problems were for 
practice in designing; the later ones were applied to actual objects, 
such as decorated imaginative toys, wooden and cardboard 
boxes, a table mat, a block-printed Christmas card, monograms, 
decorated capitals, and a stenciled runner. A variety of materials
was used. All designs, applied or not, were used in securing the 
grades for each individual. The criteria employed in scoring the 
practice designs were: originality shown; application of the design
principles and the principles of color harmony; and conformity to 
directions (whether unit, border or surface pattern; whether 
bi-symmetric, occult, or radiating balance; whether fitting 
rectangle, circle, triangle, or shape given). The last criterion, 
because of its rather mechanical aspects, was given the least 
weight in judging. Again the grouping of examples ran from 
those embodying the greatest degree of each criterion, as decided 
by the judges, to those showing the least degree. In the case of 
applied designs, the same criteria were used as well as the addi- 
tional measures: suitability of design to the shape, material and 
use of the object decorated; and suitability of the design to the 
material used in the decorating. 

In Perspective Drawing, the principles of linear perspective— 
elliptical, parallel and angular—were deduced from the study of 
objects in the classroom. These objects were then drawn; the 
drawings were criticized and corrected. Additional problems 
were then assigned, including the drawings of: a tower; a group of 
bowls; and a group of objects in parallel, angular, or elliptical 
perspective, or any combination of these. The standards used in 
judging these drawings were the correctness of applying the 
principles to the new objects and the difficulty of the drawing; for 
instance, some pupils drew a simple glass-and-book group while 
others drew the end of a room with the objects found there, or a 
street scene with the various buildings found along its length. 
The criteria set up were similar to those set up by Mathias in 
Art in the Elementary School?* in judging the tower drawing. This
same drawing of a tower, as set up by Miss Mathias, was given to 
the class in the nature of a test and was scored according to her 
plan. 

The Art Appreciation and Art History scores were secured from 
a class studying these two phases of art together. The course was 
given as a lecture course and covered particularly the fine arts of 
architecture, sculpture and painting. The principles underlying 
a work of art—the principles of design (both structural and deco- 
rative), the material used, and the use of the forms and objects— 
were studied first, as they related to each other, then as they 
appeared or were lacking in simple objects at hand in the class- 
room. The great art periods and their products were then studied 
and analyzed for these same principles. At the end of the course 
two sets of tests were given, one relating to appreciation, the
other to identification. Each set was in two parts, one dealing 
with architecture and sculpture, the other with painting. In the 
Appreciation tests, the one on architecture and sculpture con- 
sisted of six objects—three of sculpture and three of buildings. 
The pupils were to make five comments on each including favor- 
able observations. For each criticism showing keen observation 
and careful judgment one point was given.* Each object criti- 
cized had a possible score of 5. The scores secured from each of 
the objects criticized were averaged together to secure the final 
score. It was thought such a rating was more inclusive than those
used in other studies where choice between objects alone was 
used to secure the score. 

The appreciation test in painting followed a different procedure. 
It included two groups of painting—one of landscapes and one of 
figures—six examples each: two examples being accepted master- 
pieces, two not considered as highly but by artists of ability, and 
two very mediocre examples. Each painting was to be graded 
by the pupil being tested; an ‘A’ was to be given for the best two, a 
‘B’ for the next-best two, and a ‘C’ for the poorest two in each 
group. In checking the results, two points were given for each 
correct answer and one point for an answer that varied one grade 
from the accepted one. There was a possible score of twenty- 
four points. It was impossible to secure criticisms in this test 
since it was part of a much larger test and could not be easily 
isolated from the rest of the items. 
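The scoring rule for the painting appreciation test can be sketched as follows. The rule itself (two points for a correct grade, one point for a grade that is one step off, twelve paintings in all) is taken from the description above; the function name, the order of the paintings, and the pupil's answers are hypothetical.

```python
# Position of each grade on the three-step scale used in the test.
GRADE_RANK = {"A": 0, "B": 1, "C": 2}


def painting_appreciation_score(pupil_grades, accepted_grades):
    """Two points per correct grade, one point per grade that is one step off."""
    if len(pupil_grades) != len(accepted_grades):
        raise ValueError("both lists must cover the same paintings")
    score = 0
    for given, accepted in zip(pupil_grades, accepted_grades):
        gap = abs(GRADE_RANK[given] - GRADE_RANK[accepted])
        if gap == 0:
            score += 2
        elif gap == 1:
            score += 1
    return score


# Accepted grades for two groups of six paintings (landscapes, then figures),
# listed here in an arbitrary, hypothetical order.
accepted = ["A", "A", "B", "B", "C", "C"] * 2
# Hypothetical answers from one pupil.
pupil = ["A", "B", "B", "A", "C", "C", "C", "A", "B", "B", "A", "C"]

print(painting_appreciation_score(accepted, accepted))  # 12 paintings * 2 = 24
print(painting_appreciation_score(pupil, accepted))     # 10 + 8 = 18
```

A perfect paper reaches the stated maximum of twenty-four points.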

The identification tests were based on the facts learned in Art 
History. The recognition test on architecture and sculpture con- 
sisted of twenty-five pictures covering eight styles of architecture 
and sculpture. The students were directed to identify each pic- 
ture (tell to which style it belonged) and to give three reasons for 
so assigning it; that is, state three features seen in the picture 





* Comments accepted as suitable criticisms were as follows: “The position
of the arm, raised in the air, seems uncomfortable. It makes me tired
to look at it.” (Sculpture) “The house does not look ‘homey.’ It looks
more like a factory than a dwelling.” (Architecture) “It looks as though it
had floated loose in a flood and got stuck between the two trees.”
(Architecture) “The feeling is expressed through the whole body, rather than
through the face.” (Sculpture) “The platform above the door seems to be
‘stuck on’; seems not to grow out of the building.” (Architecture) “The
group is compact. It holds together well.” (Sculpture)
that would indicate the example belonged to the style chosen.
This would necessitate a recognition of facts: knowing the features 
representative of the various styles; recognizing them when seen. 
In scoring the test, three points were given for each correct 
identification and one for each logical reason given* (allowing a 
possible score of 150 points). The extra weighting of identifica- 
tion was given since that part of the test seemed most important 
and inclusive and was to be done first. It also seemed some 
students, because of slowness in recognition, might not be able to 
put down all the reasons in the time allotted to the test. 
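The stated maximum of 150 points follows from the weights given above, as this small sketch shows. The weights and the count of twenty-five pictures are from the text; the function name is hypothetical, and crediting reasons independently of the identification is an assumption, since the article does not say how reasons attached to a wrong identification were treated.

```python
POINTS_PER_IDENTIFICATION = 3
POINTS_PER_REASON = 1
REASONS_PER_PICTURE = 3
NUMBER_OF_PICTURES = 25


def identification_score(responses):
    """Score (style_correct, logical_reasons) pairs, one pair per picture.

    Assumption: logical reasons earn credit whether or not the style
    itself was identified correctly.
    """
    total = 0
    for style_correct, logical_reasons in responses:
        if not 0 <= logical_reasons <= REASONS_PER_PICTURE:
            raise ValueError("logical_reasons must be between 0 and 3")
        if style_correct:
            total += POINTS_PER_IDENTIFICATION
        total += POINTS_PER_REASON * logical_reasons
    return total


perfect_paper = [(True, REASONS_PER_PICTURE)] * NUMBER_OF_PICTURES
print(identification_score(perfect_paper))  # 25 * (3 + 3) = 150
```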

The recognition test on painting consisted of twelve examples 
chosen from the various countries and artists studied. The 
students were told to name the country to which the painting 
belonged and the artist who painted the picture, and to give two 
reasons for assigning the painting to that artist and one reason for 
assigning it to that country. In scoring, one point was given for 
naming the artist correctly, one for naming the country correctly, 
one for a valid reason for assigning it to the country, and one point 
for each characteristic of the artist given as a reason for assigning 
the painting to him.† There were 56 possible points.‡

The same group of individuals was used in all problems with the
exception of Art Appreciation and Art History. Not all of the
original group of sixty-five pupils were required to take the course
covering these two phases. Accordingly, in studying these two
problems, only those in the original group taking this work
(twenty-two individuals) were used. These scores were also
omitted when computing total art scores, since they could not be
secured for all individuals and since they are not ordinarily
considered as necessary to a general art ability.





* The following reasons were accepted as logical ones: “Fluted Columns,
Doric order” (Grecian). “Front view of the eye in side view of the face”
(Egyptian relief). “Geometric, decorative designs” (Saracenic). “Lantern
on the dome” (Renaissance). “High pointed arch” (Gothic).

† Naming subject-matter or the costume of the figures appearing in the
painting as reasons was not accepted. The following are typical of answers
considered worthy of credit: “Use of golden colors” (Dutch painting);
“Nervous feeling created by wavy lines” (Van Gogh); “Dark, rich, golden
coloring” (Rembrandt); “Elongation of figures” (El Greco).

‡ There were only 56 rather than 60 points since in two cases the artist
did not have to be identified, but only the country and the movement in
painting of which the example was representative.
The intelligence scores of the pupils were the averages of the
IQ’s recorded on their high-school records and those secured from
scores on the Otis Self-Administering Test taken by the pupils at 
the beginning of the first art course. 

The grades in the other college subjects were secured from the
permanent college record for each individual. In order to secure
the average grade for the pupil in each subject, the grades secured
in the several courses included under that subject were given a
numerical value (Table II), totaled, and divided by the number of
courses taken. In some subjects few grades were available, since
few courses were required; in others, where individuals had
majored in that subject, many more grades were used in computing
the average.


TABLE II.—NUMERICAL VALUES ASSIGNED TO GRADES IN SCHOOL SUBJECTS

School Grade.......   A   A-  B+   B   B-  C+   C   C-  D+   D   D-
Numerical Value....  11  10   9    8   7   6    5   4   3    2   1

In order to compare the art abilities with a composite school 
subject score, two average scores were computed: (1) the average 
grades secured by each individual in education, English, science, 
social studies, mathematics, and psychology were averaged 
together to secure an ‘Academic Average’ score; (2) the average 
grades secured by each individual in physical education, music, 
and art were added to those enumerated above and averaged
together to secure a ‘Total Average’ score. 
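The two composite scores can be sketched in a few lines. The grade values are those of Table II and the subject groupings are those listed above; the function names and the sample record are hypothetical.

```python
from statistics import mean

# Numerical values assigned to grades (Table II).
GRADE_VALUES = {
    "A": 11, "A-": 10, "B+": 9, "B": 8, "B-": 7, "C+": 6,
    "C": 5, "C-": 4, "D+": 3, "D": 2, "D-": 1,
}

ACADEMIC_SUBJECTS = (
    "education", "english", "science",
    "social studies", "mathematics", "psychology",
)
OTHER_SUBJECTS = ("physical education", "music", "art")


def subject_average(course_grades):
    """Average the numerical values of all course grades in one subject."""
    return mean(GRADE_VALUES[g] for g in course_grades)


def composite_averages(subject_averages):
    """Return (Academic Average, Total Average) from per-subject averages."""
    academic = mean(subject_averages[s] for s in ACADEMIC_SUBJECTS)
    total = mean(subject_averages[s] for s in ACADEMIC_SUBJECTS + OTHER_SUBJECTS)
    return academic, total


# Hypothetical record: every academic subject averages 8.0 (B),
# every remaining subject averages 5.0 (C).
record = {s: 8.0 for s in ACADEMIC_SUBJECTS}
record.update({s: 5.0 for s in OTHER_SUBJECTS})

print(subject_average(["B", "B+"]))  # (8 + 9) / 2 = 8.5
print(composite_averages(record))    # (8.0, 7.0)
```

Because the Total Average adds three subjects to the six academic ones, it is an average over nine subject averages rather than a separate average of the two groups.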


TREATMENT OF THE DATA AND FINDINGS 


Distribution curves were drawn for each art ability using the 
average scores in each and the frequency of their appearance. 
Because of space limitations these six curves are not reproduced 
here. In Curve 1—the Construction Problem—the scores are 
found skewed negatively toward the higher grades with the highest 
distribution—20—at B- and few scores below C. In Curve 2—
the Design Problem—a narrower range of scores is found, the 
curve itself is leptokurtic, and the highest distribution—20—is 
found at C+. It would seem that ability in design was neither 
outstanding nor particularly poor with the group of individuals 
used. Curve 3—the Perspective Problem—is found skewed 
positively toward the low scores. The highest distribution—17—
is found at C, yet there are few scores lower than C. Curve 4—
Total Art Scores—is leptokurtic with the highest distribution at 
C+. Curve 5—Art Appreciation—is highly platykurtic, skewed 
in a positive manner toward the low scores. Curve 6—Art 
History—really exhibits little sign of a curve. If it can be called 
one, it is skewed negatively toward the high scores and is even 
more highly platykurtic than that of the Art Appreciation. The 
range of scores is also very wide. 

It would seem from the curves: (1) that Art History and Con- 
struction abilities were easier to acquire or were possessed by more 
students; (2) that strong Perspective and Appreciation abilities 
were attained or possessed by few individuals; (3) while Design 
and Total Art abilities were present with most of the group, were 
concentrated in a medium amount, and were neither lacking nor 
present in quantity to any great extent. 


ART ABILITIES AND INTELLIGENCE AS DETERMINED BY INTELLIGENCE 
QUOTIENTS 


Scattergrams were constructed using the scores for the various 
art abilities and the intelligence scores. These scattergrams, 
of which only a few are reproduced here, bear out observations 
made from the curves of distributions and present additional
information. 


Figure 1.—Scattergram Using Construction Scores and Intelligence Scores.
(Rows: IQ intervals, 80-84 through 125-129; columns: Construction scores, D through A.)

Figure 1, Scattergram Using Construction Scores and Intelligence
Scores, shows again a tendency for scores to gather in the
higher brackets with few scores below C. It will be seen that the
group having an IQ under 100 has, with only one exception, no
grade higher than C+, and the group having an IQ over 100 has
only four scores under C+. In this connection it should be 
noted that the group over 100 in IQ (fifty-two individuals) is four 
times as large as the group having an IQ under 100 (thirteen 
individuals). The three lowest (80-90) IQ’s do not have the 
lowest grades nor do the six highest (120-130) IQ’s (again with 
one exception) have the highest grades. The group having IQ’s 
from 100 to 110 shows the widest range from one above the 
lowest grade to the highest grade for the entire group. The 
coefficient of correlation here is .47 ± .07, showing a definite
positive, although not particularly high, relationship between 
the two. This does not appear to bear out the opinion of Ter- 
man?* and others that the higher IQ’s are irked by mechanical 
problems while the lower IQ’s make a better showing in this type 
of work. 

Figure 2, Scattergram Using Design Scores and Intelligence 
Scores, again reflects the tendency for scores to center around the 
middle grades with only a scattering above and below. Again 


Figure 2.—Scattergram Using Design Scores and Intelligence Scores.
(Rows: IQ intervals, 80-84 through 125-129; columns: Design scores, D through A.)

the group having no IQ over 100 has no art score over C+. The
group having IQ’s of 100 or better includes several scores under
C+, but only five under C. The three lowest (80-90) IQ’s do
not have the lowest scores. The six highest IQ’s are this time—
with only one exception—listed in the highest grades. No
one did very well and very few did poorly. It would seem that
something besides good general intelligence is necessary to do
very well in design, although a fair degree of excellence can be
attained by those having a lower grade of intelligence. Again,
the group from 100 to 110 IQ exhibits the widest range, repeating
that found with the Construction scores: within one grade of the
lowest and extending to and including the highest scores. The
Pearson product r between Design scores and Intelligence scores
is .46 ± .07, or practically the same as that between Construction
scores and Intelligence scores.
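For readers unfamiliar with the statistic, the Pearson product-moment r can be computed as in the sketch below. The paired values are invented purely to show the calculation and are not data from this study; the function name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired lists of at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either list has no variation")
    return sxy / sqrt(sxx * syy)


# Invented illustration only: five IQ's paired with five point scores.
made_up_iqs = [85, 95, 105, 115, 125]
made_up_scores = [4, 5, 6, 6, 8]
print(round(pearson_r(made_up_iqs, made_up_scores), 2))
```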

The Scattergram Using Perspective Scores and Intelligence Scores
(not reproduced) reveals information not seen in the curve of
distribution of these scores. In it there appears a much wider
scatter of scores over the chart than in any of the other scattergrams
constructed. Nevertheless, scores center quite distinctly in the
C’s and C+’s—a narrower area of concentration than for Con- 
struction or Design scores, and also a lower grade area. The 
highest score as well as the largest number of high scores recorded 
in any scattergram are found here. There are also more poor 
scores. The highest scores—with one exception—do not belong 
to the highest IQ’s; likewise the lowest scores—with one excep- 
tion—do not belong to the lowest IQ’s. The highest IQ’s 
secured only slightly higher grades (C+ and B—) than did the 
lowest IQ’s (C and D+). 

The widest range of scores this time runs through the 100 to 105 
and the 110 to 115 IQ intervals, spreads over a wider range than 
the previous abilities covered, and includes higher IQ scores. The 
Pearson-product r in this situation was .39 ± .07, which bears
out the above observation that intelligence and perspective 
ability are not particularly related to each other. 

The Total Art scores, as previously mentioned, are secured by
averaging the scores on Construction, Design, and Perspective
problems. The Scattergram Using Total Art Scores and Intelligence
Scores shows the relation between these combined scores
and intelligence scores. The range of scores in this situation is the
smallest of all, with a centering toward the higher scores. Without
exception, the highest IQ’s had no scores under a C+; the
lowest IQ’s had no score over a C+, yet with the exception of
one high IQ score (B+), the high IQ group did not receive the
highest scores nor the low IQ group the lowest scores. The widest
range is found in the 100 to 105 IQ interval. In the group having
IQ’s over 100, there are only two scores under a C; in the group
under 100 IQ there is only one score over a C+. The correlation
between Total Art scores and Intelligence scores is .52 ± .06,
which is the highest correlation found in this group.

The Scattergram Using Art Appreciation Scores and Intelli- 
gence Scores shows a very different situation. The number of 
scores in this group is much smaller than in the previous groups, 
as explained earlier. Scores are scattered indiscriminately over 
the whole chart. The highest IQ’s do not have on the whole as 
high scores as the lowest IQ’s. The interval of 105 to 110 IQ’s has 
the widest range of scores. The Pearson-product r in this situation
is only .08 ± .14, so low that it cannot be relied upon even to
remain a positive correlation.

The Scattergram Using Art History Scores and Intelligence 
Scores covers the same group of students, and resembles the 
previous scattergrams closely. Although the range of scores 
appears much wider and there is no concentration of scores in 
any particular grade value, the highest IQ’s are found with scores 
as high as any, although not exceeding the others, and the lowest 
IQ’s have one of the lowest scores and one of the high scores. 
The widest range appears in the 100 to 105 and 110 to 115 IQ 
intervals. In the group having an IQ of 100 or more both the 
highest and lowest scores appear. The r in this situation is 
.40 ± .12, which is as high as that secured with Perspective
scores, but it is not as reliable, since its PE is high. 

Viewing these correlation coefficients together (Table III), and 
remembering the information secured from a study of the scatter- 
grams, we can form an idea of the situation existing with the 
students in this study. 


TABLE III.—COEFFICIENTS OF CORRELATION BETWEEN ART
SCORES AND INTELLIGENCE SCORES


Criteria                                           r      PE
IQ’s and Construction Scores....................  .47   ±.07
IQ’s and Design Scores..........................  .46   ±.07
IQ’s and Perspective Scores.....................  .39   ±.07
IQ’s and Total Art Scores.......................  .52   ±.06
IQ’s and Art Appreciation Scores................  .08   ±.14
IQ’s and Art History Scores.....................  .40   ±.12


From the coefficients of correlation we find a fairly high degree
of positive relation between the criteria in all cases but one—Intelligence
and Art Appreciation. In these cases, except Intelligence
and Art History, the r’s are definitely statistically reliable, being 
from five to nine times their PE’s. The positive relation between 
Intelligence and Art History is fairly reliable, the r being three 
and one-half times its PE. The r’s secured here compare quite 
favorably with those usually found between IQ’s and school 
subject grades. [See Garrett!* pp. 342-343.] 
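The reliability argument above rests on the ratio of each r to its probable error. The article does not state which formula was used for the PE; the sketch below assumes the classical formula PE = 0.6745(1 − r²)/√N, which reproduces the reported PE’s when N is 65 for the first four criteria and 22 for the last two. The r values are those given in the text; the function and variable names are hypothetical.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Classical probable error of a correlation coefficient (assumed formula)."""
    return 0.6745 * (1 - r * r) / sqrt(n)


# (r, N) for each criterion paired with IQ, as reported in the text.
IQ_CORRELATIONS = {
    "Construction": (0.47, 65),
    "Design": (0.46, 65),
    "Perspective": (0.39, 65),
    "Total Art": (0.52, 65),
    "Art Appreciation": (0.08, 22),
    "Art History": (0.40, 22),
}

for name, (r, n) in IQ_CORRELATIONS.items():
    pe = probable_error_of_r(r, n)
    # Ratio of r to its PE, the yardstick of reliability used in the text.
    print(f"{name:17s} r = {r:.2f}  PE = {pe:.2f}  r/PE = {r / pe:.1f}")
```

Under this assumption the computed PE’s round to the values printed in Table III, which is also why the two criteria based on only twenty-two pupils carry roughly twice the probable error of the others.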

It would also seem of interest to view these data from a different 
angle, referring again to the scattergrams. In the scattergram 
using Construction scores and Intelligence scores it was found that 
all Construction scores of B or higher belong—with only one 
exception—to those having IQ’s of 100 or more. This one—as
well as those between 100 and 105—would be eliminated if the 
rating was raised to a B+. In Design all scores of B— or over 
belong to the IQ’s over 100. In Perspective all scores of B— or 
over with only one exception likewise belong to the group of 
IQ’s over 100. This same situation is repeated with the Total 
Art scores. In all of these situations where an exception occurred, 
the score is found to belong to the same individual. It might be 
that this student does not have his correct IQ recorded, although 
this can not be determined now. 

In Art Appreciation, scores over thirty-four likewise—
with one exception—belong to the IQ group of 100 or more, while
in Art History all scores (again with one exception)
in slightly over half of the upper brackets of scores (over a score 
of 215) belong to the group having an IQ over 100. In the last 
two instances there were only three IQ’s under 100. 

Studying the scattergrams in a similar way with the low art 
scores it is found that in the IQ group over 100 only one indi- 
vidual made a score of less than C in Construction, five pupils 
made a grade of less than C in Design, seven made a grade of less 
than C in Perspective, and two made a grade of less than C in the 
Total Art Scores. 

Therefore, it seems that in all but one of the art abilities studied 
here, intelligence is a fairly strong factor in determining success. 

From the studies of the scattergrams we should add some observations
to this conclusion: (a) those having the highest IQ’s are
assured of moderate success, although not necessarily the highest;
(b) those having IQ’s from 100 to 110 may not only attain
the highest success but are not entirely prevented from making
the poorest showing; (c) those having the lowest IQ’s have very
little chance of attaining much degree of success in art but are not
cut off from securing a moderate amount of it.

These findings would agree with those of Kik,?° Kerschen- 
steiner,!? Terman?®*° and others who studied younger children 
or their ‘drawing’ ability. The findings of Tiebout and Meier,?! 
Peck,?’ and others to the effect that a decided drop in correlation 
is found beginning with adolescence do not hold here, unless
we might decide that art ability and general intelligence again 
show higher correlations with more mature people. Neither of 
these studies used college subjects. The results here also seem 
to differ from Lewerenz’s who likewise tested a battery of abilities 
with high-school (and lower grades) students but secured r’s 
ranging from .009 to .295. The only correlation in this study 
which seems to agree closely with findings in other studies is that 
between Art Appreciation and Intelligence (.08) which agrees 
rather closely with the r’s of .10 and .26 reported by Eurich and 
Carroll, also with college students and with quite different tests. 
It must be remembered that little has been done on this (the
college) level.


ART ABILITIES AND INTELLIGENCE AS MEASURED 
BY SCHOOL SUBJECT GRADES 


Pearson product r’s were computed between the scores of the 
various art abilities covered in this study and the scores which 
the same students secured in the subjects they studied in college. 
The resulting data are found in Table IV. 

Correlations run from —.09 (between Art Appreciation and 
Science) to .95 (between Art History and Art grades). If these 
two art abilities are disregarded for the moment (since the number 
included here is so small) and the correlations with Art grades are 
also disregarded (since higher correlations might be expected 
here) correlations are found to run from .23 to .53. These are 
lower than those found by Ayer®® with his normal-school pupils 
but, whereas this study used single art abilities as criteria, he used 
teachers’ grades which, he claimed, included many other factors. 
In comparing drawing ability with school grades he found no 
correlation. 

Art History correlates more highly with other subject grades 
than do any of the other art abilities; next come Total Art, 
TABLE IV.—COEFFICIENTS OF CORRELATION BETWEEN ART
ABILITY SCORES AND SCHOOL SUBJECT SCORES

                                                Art Abilities
                                                                              Art           Art
School Subjects       Construction  Design        Perspective   Total Art     Appreciation  History
                         r  PE         r  PE         r  PE         r  PE         r  PE         r  PE
Education...........   .39  .07      .39  .07      .39  .07      .50  .06      .29  .13      .42  .12
English.............   .24  .08      .35  .07      .36  .07      .43  .07      .15  .14      .69  .07
Science.............   .24  .08      .28  .08      .36  .07      .36  .07     -.09  .14      .44  .12
Social Studies......   .36  .07      .40  .07      .35  .07      .43  .07      .03  .14      .70  .12
Mathematics.........   .40  .07      .38  .07      .46  .07      .48  .06      .26  .13      .31  .13
Psychology..........   .31  .08      .31  .08      .40  .07      .42  .07      .26  .13      .56  .10

Academic Average....   .39  .07      .40  .07      .47  .07      .51  .06      .22  .14      .58  .09
Music...............   .35  .07      .33  .07      .36  .07      .41  .07      .23  .14      .68  .08
Art.................   .73  .04      .80  .03      .66  .05      .84  .02      .33  .13      .95  .01
Total Average.......   .44  .07      .47  .07      .51  .06      .53  .06      .22  .14      .62  .09

N...................    65            65            65            65            22            22





Perspective, Design, followed by Construction, and, last of all, 
Art Appreciation. This is seen to be true when comparing the 
r’s of each art ability with the separate school subject scores, the 
Academic Average, or the Total Average. Each art ability 
correlates more highly with Art grades than with any other 
grade, as might be expected since each had its part in producing 
the Art grade. The lowest of these r’s is with Art Appreciation 
(.33). The explanation of this might lie in the fact that pupils’
products were judged in giving art grades and not their critical 
appreciation of these or other art products. The surprisingly 
high correlation of Art History with Art grades (.95) is perhaps 
partly due to the fact that Art grades were secured by giving 
two-thirds value to pupil art products and one-third to test 
results where tests covered facts learned in the art courses. Both 
were thus largely a test of acquired facts. The higher correlations
of Art History with other school subjects could have
resulted from similar situations. This seems to be a fairly
reliable observation when it is seen that Art History r’s exceed other
art ability r’s in social studies (r is .70), English (r is .69), music 
(r is .68), psychology (r is .56), science (r is .44), education (r is 
.42), academic average (r is .58), and total average (r is .62). 
It is less than the others in mathematics and physical education 
only. 

Art Appreciation r’s are the lowest in the group, reaching a low 
of —.09 with science and a high of .33 with art grades. In 
physical education and music—where one would expect some 
appreciation developed—the r’s although low (.30 and .23, 
respectively) compare most favorably with the r’s of these 
subjects with the other art abilities. In fact, the correlation 
with physical education even exceeds most of the other r’s with 
that subject. Classical dancing and rhythmic gymnastics have 
always been stressed in the physical education courses of these 
students. 

Total Art scores which combined the remaining art ability 
scores would be expected to produce higher r’s than these indi- 
vidual scores. 

Perspective, which correlated lowest with intelligence quo- 
tients, produced higher correlations in most cases with the other 
subjects (English, science, mathematics, psychology, music), 
Academic Average, and Total Average. It correlated to a 
similar extent as the others with education, social studies, and 
physical education. It correlated lowest, in comparison, with 
art. Perspective problems seem to represent some ability that 
is needed for most school subjects, yet is not needed to any great 
extent in performance on a test of intelligence. 

It would be expected that Construction, a mechanical ability, 
would not be needed particularly, or would have little part in 
other school subjects considered in this study. The lower 
correlations here bear this out, yet the r with education is as high 
as any r with that subject, and the r’s are fairly high, also, with 
social studies and music. As might be expected they are somewhat
higher with mathematics and physical education; in fact,
Construction has the highest r secured with physical education. Both being
based upon muscular coordination, this situation should be
expected.
Design correlations are also fairly low. With most school
subjects creative expression has little place. The highest 
correlations in this set are with art, social studies, and education. 
These latter two bring up questions. Other correlations do, too. 
For instance, why should Total Art correlate more highly with 
education than with mathematics? 

In studying the correlations between art abilities and art grades 
it is noted that they follow this order (high to low): Art History, 
Total Art, Design, Construction, Perspective, and Art Appreciation.
The order is as one might expect, since each contributed
in a similar order to the art grade. One criticism of the art
courses offered here might follow: the apparent lack of
emphasis upon appreciation.


TABLE V. COMPARISON OF ART ABILITIES AND GENERAL
INTELLIGENCE AS INDICATED BY INTELLIGENCE QUOTIENTS
AND SCHOOL SUBJECT AVERAGES

In comparison with other studies, the r between Art Appreciation
and art grades agrees quite closely with that which Carroll’
found between results on art judgment tests and art grades of 
college students (.40 with the Meier-Seashore test and .14 with 
the McAdory test). The other studies including correlations on 
college level employed criteria not used in this study and there- 
fore cannot be compared. 

It might be concluded here that art abilities correlate fairly 
high with college school grades and about as high as school grades 
are usually found to correlate with each other.

In Table V are grouped together the r’s between art abilities
and those criteria which seem to give a more general indication 
of intelligence: IQ’s, Academic Average, and Total Average. 

The highest r’s are found between Total Average and art 
abilities. The other two criteria divide their favors: one correlates
more highly with one half of the art abilities, the other more
highly with the other half. The widest divergences in r’s between
intelligence criteria and art abilities are found with perspective,
art appreciation, and art history, yet here the r’s are fairly close
when either Academic Averages or Total Averages are consulted.
All intelligence criteria correlate almost identically with
Total Art ability and almost equally well with Design ability.
Further, it might be concluded that the Total Average rating of 
an individual is a better indication of the type of intelligence 
embodied in all art abilities than are IQ’s and Academic Averages; 
also IQ’s are a better indication than Academic Averages of 
factors necessary for Construction, Design, and Total Art 
abilities while Academic Averages are a better indication than 
IQ’s of the factors needed in perspective, art appreciation, and 
art history. 

Another view of this situation might be found when comparing 
the data of those pupils receiving the highest Total Art scores and 
those receiving the lowest Total Art scores (Tables VI and VII).
Those having a score of B+ and B were considered
as showing better than average ability.* Those with a B-
score were considered on the doubtful line between above average
and average ability. C- scores were interpreted similarly as
being on the doubtful line between average and below average
work. These estimates were checked at the extremities with
Art Grades and were verified by the findings: all those having a
Total Art score of B or B+ had an Art Grade of B- or higher;
those having a Total Art score of C- and D+ had Art Grades of
D value in every case but two, which were a C and C-. Accordingly
it was decided to divide those showing definite art ability
from those showing average art ability between B and B- scores
and to separate those showing little art ability from those showing
average art ability between the C and C- scores.

* See Table I.

In Table VI the data relative to those receiving the highest
Total Art scores are presented. The average Total Art Score
and average Art grade are B+. The average IQ is 113.8.
The averages of subject scores range from B- to A-, showing
a fairly close relationship of high scores with high scores
throughout.

In Table VII the data for those receiving the lowest Total Art 
scores are given. 

The Total Art scores average C-, the art grades of these
individuals D+, and their IQ’s, 97.5. The averages of the other
subject grades range between C+ and D+, in this case also
showing a fairly high relationship of low scores with low
scores. When presented graphically, the averages of
these two groups, the lowest and the highest, do not meet at any
point.

Despite occasional and rare individual variations, it might 
again be concluded with fair certainty that art abilities correlate 
to a fairly close extent with the amount of intelligence possessed 
by individuals as indicated by their IQ’s and by their attainment 
in other school subjects. 


ART ABILITY AND PUPIL ATTAINMENT IN 
OTHER SUBJECTS 


One of the questions raised at the beginning of this study 
referred to whether or not pupils showing definite art ability 
excelled in the same subjects other than art. For this purpose 
the data from Table VI and Table VII were re-arranged into 
scattergram form. The school grades of the pupils receiving the 
highest Total Art scores were arranged according to the frequency 
of their appearance. It was seen that in certain subjects—such 
as art, mathematics, and social studies—and to a somewhat 
lesser extent in education and psychology, all pupils were found 
to be in the higher brackets. This does not hold true for the 
other subjects. English, science, and music show a wide scatter- 
ing of scores. This group, as a group, excelled in mathematics 
and were less able in physical education. The small number of 
pupils in the group makes it impossible to draw definite conclu- 
sions, but within limits it might be said that those pupils who 
excel in art also excel as a group in some subjects, are less able in 
a few, while in others they scatter widely in ability. 

The original problem as stated did not inquire into the ranking 
of pupils poor in art in their other school subjects, but it seemed
worth while to include it and, perhaps, make comparisons. 

Although a wider total range of scores is found with this group
(A- to D-), the range in individual subjects is usually not so
great except in psychology. On the whole, scores are very
closely grouped within themselves and with each other. The
scores in education, English, science, social studies, and mathe- 
matics follow one level of attainment with those of physical 
education, music, and art on a slightly lower level. It would 
seem—although our number here is so small that no definite 
statement should be made—that the poor art students do an 
inferior piece of work, poorer in the expression subjects, slightly 
better in the academic subjects. 

In comparison it seems that the lower art students are more 
consistently low in subjects other than art and show less varia- 
bility in the range of attainment in these subjects. 


CONCLUSIONS 


Certain factors stand out in studying the various phases of this 
investigation. Taking each phase alone, certain conclusions can
be drawn; taking all together, others can be deduced.

Using IQ’s to represent individuals’ intelligence capacity, we 
may conclude: 

1) A good general intelligence is an assisting factor in attaining 
success in art. 

2) Those having the highest scores in intelligence are assured 
of moderate success at least, although not always the greatest 
success, in school art. 

3) Those having intelligence scores from 100 to 110 may
attain the highest success, but are also not entirely precluded
from making the poorest showing in school art.

4) Those having the lowest IQ’s have very little chance of 
attaining much success in art, but are not excluded from securing 
a moderate amount of it. 

5) A critical appreciation of art does not appear to be affected 
by the degree of intelligence a person shows.* 





* This conclusion appears to agree with Christensen and Karowski** who
claimed that “Art appreciation ... brings out a special phase of
mental activity which is not reached by the intelligence tests.”

Using college ‘school grades’ to represent an individual’s
intelligence capacity, we may conclude: 

1) That school grades correlate fairly high with art abilities. 

2) That pupils receiving low art grades are also likely to
receive similarly low grades in other subjects; likewise, pupils
receiving high art grades are likely to receive similarly high grades
in other subjects.

3) That art abilities correlate with school grades to the same 
extent as school grades are usually found to correlate with each 
other. 

Comparing success in art with attainment in other school 
subjects it is found that: 

1) There is a tendency for those securing high art scores in this 
study to excel in certain other subjects although not consistently 
in all. They excelled definitely in mathematics and were weakest
in physical education. 

2) Those showing success in art secured higher standing in all 
subjects than those doing poor work in art. 

3) Those doing poor work in art were more consistent in their 
attainments in other subjects. They did slightly better in the 
academic subjects than in the ‘expression’ subjects. 


GENERAL CONCLUSIONS 


General conclusions are evident: 

1. Both the highest and the lowest IQ’s represented in this
study showed a tendency to be less variable than the middle
group and to tend toward the average in all measures of art
ability.

2. All art abilities, with the exception of Art Appreciation, 
correlate fairly well (as well as do other school subjects) with 
intelligence—whether measured by intelligence quotients or by 
attainment in other school subjects. 

3. With the exception of Art Appreciation, those pupils 
showing success in one art ability will—with a high degree of cer- 
tainty—show success in other art abilities. 

4. Those pupils receiving the lowest art scores in this study 
were more consistent as a group in their attainments in other 
school subjects. Those receiving the highest art scores, on the 
other hand, as a group appeared to excel in a few subjects, be 
lower in a few, and vary in attainment in the others. 





RECOMMENDATIONS


It is suggested that other abilities in art be studied in a manner
similar to those in this study, particularly for the purpose of
making revisions, if necessary, in the art curriculum. It is also
suggested that a further study of the ability to appreciate art be
conducted to learn its nature, the factors included in it, and the
factors which foster it and its development.

It is further suggested:

That more time in the art courses be devoted to the appreciative
factors.

That the course covering art appreciation be lengthened or a 
separate course in appreciation be offered. 

That the school art grade give less value to test results if it is 
to be a closer indication of art ability. If that is not the purpose 
of the grade, this recommendation may be disregarded.


AN ANALYTIC STUDY OF THE MULTIPLE CHOICE
ANALOGIES TEST ITEM 


GEORGE A. ZIRKLE 
Lt. Comdr., USNR, Classification Research Division,
Bureau of Naval Personnel*


This study reports the results of an analysis of two multiple 
choice analogies tests to determine the relative effectiveness or 
drawing power of the various words used as the five choices 
presented for the completion of the analogies. 

The idea for the research grew out of a superficial examination 
of a five-choice type word analogies test administered experi- 
mentally to over six hundred officer candidates undergoing the 
qualification training prior to commissioning in the United States 
Navy. The test was item-analyzed by comparing responses of 
that twenty-seven per cent of the men having the highest total 
scores on the test with responses of the twenty-seven per cent 
scoring lowest. Early examination of analysis results indicated 
that distractors which were synonymous with the third member 
of the analogy were chosen more frequently by men in the low- 
scoring group than by those in the high-scoring group. 
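
The comparison of the highest and lowest twenty-seven per cent described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the computation actually used in the study: the function names, the data layout (a total score per examinee and the choice each examinee made on each item), and the toy data are all assumptions introduced here.

```python
from collections import Counter


def split_high_low(total_scores, fraction=0.27):
    """Return (high, low): examinees in the top and bottom `fraction` by total score."""
    ranked = sorted(total_scores, key=total_scores.get, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]


def choice_percentages(responses, group, item):
    """Per cent of the examinees in `group` selecting each choice of `item`."""
    counts = Counter(responses[person][item] for person in group)
    return {choice: 100.0 * n / len(group) for choice, n in counts.items()}


# Hypothetical data: 100 examinees with total scores 1..100.
total_scores = {f"p{i}": i for i in range(1, 101)}
# On hypothetical item 0, examinees scoring above 50 pick choice 5, the rest choice 1.
responses = {p: {0: 5 if s > 50 else 1} for p, s in total_scores.items()}

high, low = split_high_low(total_scores)
high_pct = choice_percentages(responses, high, 0)
low_pct = choice_percentages(responses, low, 0)
print(len(high), len(low), high_pct, low_pct)
```

Ties at the cutting scores are broken arbitrarily in this sketch; the article does not say how they were handled.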

At about the time this trend was noted, a word analogies test 
for Navy enlisted men was being constructed. Following up the 
observation made on the officer test, numerous distractors which 
were positively associated with or synonymous with the third 
member of the analogy were included in the enlisted men’s test. 
At the same time, the decision was made to study analytically 
both the officers’ and the enlisted men’s tests to determine more 
accurately the influence of association of the distractor with the 
third member of the analogy. 


THE OFFICER TEST 


The officer test contains one hundred items. A sample item is
the following: “Travel is to highway as activity is to: (1) transportation
(2) schedule (3) inhibition (4) hiking (5) exercise.”
Common words were used in the analogies almost entirely so that
the test might give full scope for the measurement of reasoning
ability rather than vocabulary accomplishment.

* The material in this article should be construed only as the personal
opinion of the writer and not as representing the opinion of the United
States Navy Department.

An index of the degree of positive association of each of the five 
choices in an item to the third member of the analogy was 
obtained in the following way: Four college-trained persons acted 
as judges. They were instructed to pay no attention to the first 
and second members of the analogy in arriving at the judgment. 
Each of the one hundred items was to be assessed as a separate 
unit. If none of the five choices was positively associated in the 
mind of a judge with the third member of the analogy, none was to 
be marked. If one choice, either correct or incorrect, was posi- 
tively associated with the third member, it was to be checked. 
If two or more choices were deemed to be positively related, 
they were to be marked respectively, ‘1’ to show the greatest 
degree of positive association, ‘2’ to show the next greatest degree 
of positive association, etc. Item analysis results were not 
available to the judges when their decisions were made. 

The choices were divided into three groups according to their 
degree of positive association with the third member of the 
analogy. (1) The group having the Greatest Degree of Associa- 
tion included choices which were rated ‘1,’ or most closely 
associated, by all of the judges. (2) The group having a Medium 
Degree of Association included choices which two or more judges 
rated as having some degree of association, with the exception 
of those rated ‘1’ by all of the judges. (3) The group having 
Little or No Association included choices which no judge or only 
one judge indicated as having positive association with the third 
member of the analogy. Six items, all of whose distractors fell in 
this third group, were not included in the analysis of wrong 
choices. They were omitted in order to sharpen the contrast 
between associated and non-associated choices. 
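
The three-way grouping rule for the officer test can be written out as a short Python function. This is a sketch under stated assumptions rather than the author's own procedure: the function name and the encoding of the judges' marks are hypothetical (`None` for an unmarked choice, `1` for the closest association, `2`, `3`, and so on for lesser degrees), and a simple check mark on a lone associated choice is assumed to count as a `1`.

```python
def officer_group(ratings):
    """Classify one choice from the marks of the four judges.

    ratings: one entry per judge; None = not marked, 1 = closest
    association (a lone check mark is assumed to count as 1),
    2, 3, ... = lesser degrees of association.
    """
    if not ratings:
        raise ValueError("at least one judge's rating is required")
    marked = [r for r in ratings if r is not None]
    # Group 1: rated '1' by all of the judges.
    if len(marked) == len(ratings) and all(r == 1 for r in marked):
        return "Greatest Degree of Association"
    # Group 2: two or more judges saw some association (group 1 excluded above).
    if len(marked) >= 2:
        return "Medium Degree of Association"
    # Group 3: no judge, or only one judge, saw any association.
    return "Little or No Association"


print(officer_group([1, 1, 1, 1]))
print(officer_group([1, 2, None, None]))
print(officer_group([None, None, 3, None]))
```

Under this rule a choice marked '1' by three judges and left blank by the fourth falls in the Medium group, since the Greatest group requires agreement of all the judges.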

Table I shows the percentages of the highest twenty-seven per 
cent and the lowest twenty-seven per cent of the population 
selecting choices in the different groups. The average product 
moment coefficients of correlation between the continuous varia- 
ble and correct choices in the different groups are also given. 

Correct Choices:—Table I shows that as the degree of association 
increases from little or no association to that of greatest associa- 
tion the tendency for the members of the low group to choose the 
correct choice increases by twenty-five per cent (from 46.5 to 
71.5), while the tendency for the members of the high group to
select the correct choice increases only 12.8 per cent (from 78.7 
to 91.5). The closer the correct choice is related to the third 
member of the analogy, the poorer its discrimination between 
high and low scorers. Items whose correct choices are not posi- 
tively associated with the third member of the analogy are more 
difficult, but they discriminate better. 

Incorrect Choices:—Table I reveals that the more closely the 
distractor is associated with the third member of the analogy, the 
greater the tendency for both ‘highs’ and ‘lows’ to choose it. 
The tendency is much more marked in the case of the ‘lows,’ however.
The difference between high and low percentages where
there is closest association is much greater (25.6 minus 7.8 equals
17.8) than the like difference in percentage where there is little or
no association (7.2 minus 2.7 equals 4.5). The closer a distractor is
associated with the third member of the analogy, the better its 
discrimination between high and low scorers. 
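
The arithmetic behind these two statements can be laid side by side. The short Python sketch below uses only the percentages quoted in the text for the two extreme groups; the variable and function names are illustrative assumptions, and the high-low gap in percentage points is used here as a rough stand-in for discrimination.

```python
# Percentages quoted in the text for the officer test (per cent choosing).
correct = {
    "little or no association": {"high": 78.7, "low": 46.5},
    "greatest association": {"high": 91.5, "low": 71.5},
}
incorrect = {
    "little or no association": {"high": 2.7, "low": 7.2},
    "greatest association": {"high": 7.8, "low": 25.6},
}


def gap(pair):
    """Absolute high-low difference, in percentage points."""
    return round(abs(pair["high"] - pair["low"]), 1)


correct_gaps = {label: gap(pair) for label, pair in correct.items()}
incorrect_gaps = {label: gap(pair) for label, pair in incorrect.items()}
print(correct_gaps)
print(incorrect_gaps)
```

For correct choices the gap narrows from 32.2 to 20.0 points as association increases, while for distractors it widens from 4.5 to 17.8 points, which is the opposition the two paragraphs describe.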

At a later date, in order to determine the influence of closeness 
of association between an incorrect choice and the first or second 
member of the analogy, judges marked the incorrect choices 
again. This time their judgments were based on the relation 
between the incorrect choice and either one or both of the first 
two members of the analogy, without any reference to the third 
member. In analyzing the judgments, any choice which two or 
more judges marked as having some degree of positive association 
with the first or second member of the analogy was categorized 
as so associated. Some of these associated choices were also 
considered by the judges to be related to the third member of the 
analogy, according to the earlier marking. In such cases, the 
choices were not tabulated. 

Seventy-two choices were judged as associated with the first two 
members but not the third member of the analogy. One and a 
half per cent of the ‘highs’ selected these choices as against 5.3 per 
cent of the ‘lows.’ These percentages are lower than like per- 
centages in the group of distractors having little or no association 
with the third member of the analogy (high, 2.7 per cent; low, 
7.2 per cent). It thus appears that choices which have little 
or no positive association with any of the first three members of an 
analogy pull slightly better than distractors which do bear a 
positive relationship to either the first or second members of the
analogy, or both. 


THE ENLISTED MEN’S TEST 


The advantage of a superficial analysis of the officer test was 
already at hand when the enlisted-man test was constructed.
Consequently a greater number of distractors possessing close
relationship with the third member of the analogy were intro- 
duced. The test contains ninety-six items. These items are 
easier than those contained in the officer test but are phrased in 
the same fashion. The test was administered to more than four 
hundred Navy enlisted men in a recruit training station. The 
group was a much less selected one than the group of midshipmen 
which took the officer test. 

Six judges assessed the degree of positive association of each of 
the five choices in an item to the third member of the analogy. 
Four of these judges were college-trained and two had had only 
high-school training. The same technique of marking degree of
association was used as with the officer test. Item analysis data
were not available when the judgments were made. 

Four groups of choices were categorized in terms of their degree 
of positive association with the third member of the analogy: 
(1) Choices having the Greatest Degree of Association were those 
which four or more judges marked ‘1.’ (2) Choices having the 
Next Greatest Degree of Association comprised those which four 
or more judges marked ‘2’ or the equivalent thereof, with the 
exception of those belonging in the first group. (3) Choices hav- 
ing Some Association were those which three or more judges 
marked as associated, but which did not measure up to the stand- 
ards for the first two groups. (4) Choices having Little or No 
Association included those which no judges or no more than two 
judges marked as associated. 
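
The four-way rule for the enlisted men's test can be sketched in the same fashion. The encoding of the six judges' marks is hypothetical (`None` for unmarked, `1` for closest association, `2` for the next, and so on), and the phrase "‘2’ or the equivalent thereof" is not defined in the article; the sketch assumes, as one possible reading, that marks of either `1` or `2` count toward the second group.

```python
def enlisted_group(ratings):
    """Classify one choice from the marks of the six judges.

    ratings: one entry per judge; None = not marked, 1 = closest
    association, 2 = next closest, 3, ... = lesser degrees.
    """
    marked = [r for r in ratings if r is not None]
    # Group 1: four or more judges marked '1'.
    if sum(r == 1 for r in marked) >= 4:
        return "Greatest Degree of Association"
    # Group 2 (assumed reading of "'2' or the equivalent thereof"):
    # four or more judges gave a mark of 1 or 2.
    if sum(r <= 2 for r in marked) >= 4:
        return "Next Greatest Degree of Association"
    # Group 3: three or more judges saw some association.
    if len(marked) >= 3:
        return "Some Association"
    # Group 4: no more than two judges saw any association.
    return "Little or No Association"


print(enlisted_group([1, 1, 1, 1, 2, None]))
print(enlisted_group([2, 2, 2, 2, None, None]))
print(enlisted_group([3, 3, 2, None, None, None]))
print(enlisted_group([1, 2, None, None, None, None]))
```

A stricter reading, in which only explicit marks of `2` count toward the second group, would move some choices from the second group into the third.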

Table II summarizes the results of the analysis for the Enlisted- 
man Test. The same tendencies appear in this table, which 
shows four degrees of relationship, as in Table I for the Officer 
Test, which shows only three degrees of relationship. Here they 
are more marked, however. 

Correct Choices: Table II shows that a decrease in the degree
of association of the correct choice with the third member of the
analogy is accompanied by an increase in the discrimination of the
choice. This increase is shown both by the differences in ‘high’ 
and ‘low’ percentages and by the coefficients which measure the 
value of this difference. 

Incorrect Choices:—On the other hand, decrease in the associa- 
tion of the false choice with the third member of the analogy is 
accompanied by decrease in its discriminating value. This is 
shown by the decrease in the difference between ‘high’ and ‘low’ 
percentages. 

At a later date, six judges marked the test to show degrees of 
positive association of false choices with either one or both of the 
first two members of the analogy. No consideration was paid to 
the third member of the analogy in the marking process. Choices 
which were marked by three or more judges as being positively 
related to either one or both of the first two members of the 
analogy were so considered for analysis purposes. Some of these 
had earlier been marked as positively associated with the third 
member of the analogy and were excluded from further considera- 
tion. A total of forty-four choices met the criterion established. 
The percentage of ‘highs’ selecting false choices associated with 
either or both of the first two members of the analogy was 1.4. 
The like percentage of ‘lows’ was 6.4. Here, as with the officer 
test, fewer ‘highs’ as well as ‘lows’ selected these choices than 
selected choices which bore little or no positive relation to any of 
the first three members of the analogy. 

The discriminating values of items in this test are superior to 
the values of other word analogies items constructed in this divi- 
sion. It appears very doubtful that this difference in value can 
be explained without reference to the inclusion in the subject test 
of many more choice words which are very closely associated with 
the third member of the analogy. Predictions were made about 
the item analysis on specific choice words before the analysis was 
available. These later proved to be correct in a large majority 
of cases. The basis for prediction lay in assessing the degree 
of positive association between a choice word and the third 
member of the analogy and reasoning therefrom. Conversely, 
it was possible in a large majority of the cases to predict from the 
item analysis whether a choice word was closely associated with 
the third member of the analogy or not. 

In passing it may be noted that there were a few cases of distractors
which were opposite in meaning to the third member of
the analogy. Analysis of these few cases showed a tendency for 
the ‘lows’ to select such distractors more often than the ‘highs.’ 


DISCUSSION 


Results of analysis of these two tests will raise the pertinent 
question as to whether they are measuring the ability to discern 
analogous relationships any more than the ability to recognize 
positive associations between choice words and the third member 
of the analogy. Certainly it must be said that those who select 
a correct choice word which is closely associated with the third 
member of the analogy do so for varying reasons. Many doubt- 
less will choose it because it completes the proper analogous 
relationship. Some will choose it for chance reasons. Others, 
entirely aside from any consideration of the first two members of 
the analogy or the relation between them, will choose the correct 
word because it is positively related to the third member of the 
analogy. This research indicates that the number of these latter 
will not be inconsiderable, particularly among the ‘low’ scorers. 

What is true for selection of the correct choice is also true for 
selection of false choices. Here, again, it is the low more often 
than the high scorer who selects the distractor which is positively 
associated with the third member of the analogy. 

In the two tests studied, the more closely a word choice is 
related to the third member of the analogy the more tendency 
there is for it to be chosen. However, it is of no apparent advan- 
tage to introduce word choices which are positively associated 
with either one or both of the first two members of the analogy, 
but not with the third. Indeed, the evidence indicates that 
choices which are related to none of the first three members of the 
analogy will be chosen more often.

So far as these findings are concerned, this conclusion may be 
drawn about the word analogies item which tends to produce the 
maximum degree of discrimination between low and high scorers: 
It is an item whose correct choice bears little or no positive asso- 
ciation with the third member of the analogy but whose dis- 
tractors are closely associated with it. 


THEORETICAL IMPLICATIONS 


We may ask why it is that low scorers more than high scorers 
are likely to avoid correct choices which are not closely associated 
with the third member of the analogy and to choose false alternatives
which are so associated. No doubt there are many factors
to account for this. One area of explanation will be commented 
on here briefly. 

When a person must decide how an object is related to another 
object, one of the simpler mental operations is the determination 
of whether there is any likeness or ‘belongingness’ between them. 
It may be supposed that high more than low scorers are able to 
perceive the analogous relationship between the first members of 
the analogy and then carry over this relationship to the third 
member and the proper one of the five choices. On the other 
hand, low scorers are less able to perceive and carry over the 
analogy. In default of this, they will cast about and indicate or 
check a simpler type of relationship, one in which the words are 
like each other in some way or ‘belong’ together. It may even 
be that low scorers will indicate or check a close or ‘belonging’ 
relationship when they recognize the proper analogous relation- 
ship. This could be under the spur of rapid decision in a timed- 
test when the ‘likeness’ or ‘belonging’ relationship has a more 
fundamental or seductive appeal to the testee than the analogous 
relationship. 

Further investigation alone can give us the answers to the 
problems posed here. It does seem to the author that this gen- 
eral area of approach to some of the problems of differential 
abilities may prove a singularly rewarding one.


FURTHER EVIDENCE ON THE UNEXPECTED
LARGE SIZE OF RECOGNITION VOCABULARIES
AMONG COLLEGE STUDENTS


GEORGE W. HARTMANN 


Teachers College, Columbia University 


In an earlier report in this JOURNAL,¹ the present writer called
attention to a variety of methodological errors which vitiate most 
of the well-known estimates of absolute or total vocabulary size, 
and submitted an extensive mass of data indicating that the aver- 
age adult’s demonstrable word knowledge has been greatly 
underestimated. Since the publication of this article, no serious 
dissent from the conclusions there presented has appeared, 
although there have been informal attempts at partial or alter- 
native explanations based upon such considerations as: (1) the 
difficulty of determining what constitutes a separate or distinct 
word, (2) the unscientific or nonstatistical conventions of dic- 
tionary-makers and printers, (3) the existence of multiple mean- 
ings for identical symbolic forms (i.e., the semantic count 
concept), etc. All these were recognized in the original paper 
which nevertheless indicated that the best available figures show 
that the average undergraduate has a minimal conceptual under- 
standing of at least half of all the entries in the latest unabridged 
dictionaries. This means in literal truth an operating or reading 
recognition vocabulary of approximately 200,000 words.

Most educators and psychologists have viewed this as an excep- 
tionally high estimate—‘high’ only because Seashore’s estimate, 
the highest reported previous to the present writer’s calculations, 
did not exceed 75,000—which is in turn much larger than 
the amounts given in the older literature where figures low in the 
hundreds or at the most a few thousand appear. Apparently, the 
later and more critical methods reveal a steady tendency to credit 
the products of our schools with a progressively larger vocabulary. 
Whatever skepticism has been directed toward the writer’s con- 
tentions does not affect the actual figures reported, but centers 
about the interpretation or reflects a suspicion that some hidden 
procedural joker contains the secret of these unprecedentedly large
values.

In an effort to see whether essentially the same magnitudes
formerly found were still recently obtainable under such altered 
conditions as: (1) a very different group of subjects, (2) regionally 
and culturally quite distinct populations, and (3) measures taken 
more than a decade apart—the investigator selected fifty words 
(one from the same relative position on every fortieth page) from 
the latest available unabridged Merriam Webster’s New Inter- 
national Dictionary and administered them to one hundred six 
members of the usual four undergraduate classes at the Alabama 
Polytechnic Institute during the Summer Quarter of 1945. This 
was a sample of about one student in ten then in attendance. 
While it is possible that this Summer Quarter student body as a 
whole was somewhat more academically ambitious than one 
comprising persons unwilling to endure the climatic punishment 
involved in college work at that time, the randomized way in 
which respondents were obtained assures reasonable ‘typicality’ 
in the simple findings here reported. 

The subjects were simply directed without time limit to supply 
the meanings of these terms which were presented to them in 
column fashion on mimeographed sheets. A word was scored 
‘correctly defined’ if the elements of the answer indicated that the 
respondent had some familiarity, however slight, with the true 
significance of the term. 


TABLE I. SCORES IN TERMS OF NUMBER RIGHT ON A FIFTY-ITEM
VOCABULARY TEST TAKEN BY REPRESENTATIVE AUBURN,
ALABAMA, STUDENTS


Class          N      Mean     Lowest Score    Highest Score
Freshmen       26     23.54    16              29
Sophomores     29     25.35    19              32
Juniors        21     28.75    18              34
Seniors        30     29.87    19              36

Total          106    26.86    16              36


The essential findings appear in Table I. The steady rise in the
averages from year to year probably represents the combined 
effects of the instructional program and the gradual selection of 
the more verbally-minded for a degree.

Since every sample word included in the test list ‘represents’
approximately 8,000 words in the unabridged lexicon of at least 
400,000 words from which it is derived, it appears that we may 
rightly attribute to the average land-grant college student in 
Alabama a vocabulary of 215,040 words, roughly one-half of 
those listed in a ‘complete’ dictionary. If this figure seems 
fantastically high (its ‘spuriousness’ so far as exactness goes is 
readily admitted since it can never be more than a relatively 
crude estimate, for the multiplying factor may really be nearer 
10,000), particularly in view of the well-known comparative 
academic limitations of Southern students, one can only observe 
that similar huge values are regularly obtained from college stu- 
dents anywhere when this conventional method of determining 
vocabulary size is employed. So far, no one has detected any 
defect in this procedure, and it seems but fair to conclude that 
the American undergraduate is not as restricted in his knowledge 
of words as many of his severest critics have assumed. 
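
The arithmetic behind this estimate can be sketched briefly. What follows is only an illustrative calculation, assuming (as the text states) a fifty-item test drawn from a lexicon of about 400,000 words, so that each item stands for roughly 8,000 words. The function and variable names are hypothetical and form no part of the original study. Applied to the overall mean of 26.86 given in Table I, the rule yields about 214,880 words, slightly below the 215,040 reported; the latter figure corresponds to a mean of 26.88.

```python
# Illustrative sketch only: names and structure are assumptions, not the author's.
LEXICON_SIZE = 400_000  # approximate words in the unabridged dictionary (from the text)
TEST_ITEMS = 50         # items on the vocabulary test (Table I)


def estimate_vocabulary(mean_correct, words_per_item=LEXICON_SIZE / TEST_ITEMS):
    """Scale a mean number-right score up to an estimated vocabulary size."""
    # Each sampled word 'represents' words_per_item words of the full lexicon.
    return round(mean_correct * words_per_item)


overall_mean = 26.86  # mean for all 106 students in Table I

# Estimate with the stated factor of about 8,000 words per test item.
print(f"{estimate_vocabulary(overall_mean):,}")

# Estimate with the alternative factor of 10,000 mentioned in the text.
print(f"{estimate_vocabulary(overall_mean, words_per_item=10_000):,}")
```

The multiplying factor is kept as a parameter because the text itself concedes that it may be nearer 10,000 than 8,000; the estimate scales in direct proportion to whichever factor is assumed.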

The figures given above must be taken as added confirmation 
of the position adopted by the writer as early as 1930, viz., that 
the recognition vocabulary of the reading public is very much 
larger than most disparaging educators, psychologists, advertisers, 
politicians, and language experts have generally believed. Rinsland’s
recent exhaustive tabulation of the active writing vocabulary
of American elementary-school children showed that the
total number of different words used in this way rose from 5,099 
in the first grade to 17,930 in the eighth. Commenting on this, 
Rinsland rightly declares (p. 20): “Certainly, from previous
word lists, few students of children’s vocabularies would have 
predicted the finding of as many as 25,632 different words or 
14,571 words occurring with a frequency of three or more in any 
one grade.” 

The implications of this for the school of advocates of over- 
simplified writing must be obvious. Certainly, the persistence of 
the unwarranted view that the ‘average American’ has at his 
disposal no more than a few hundred, or, at the most, a few 
thousand words, becomes more untenable and unjustified than 
ever before. ‘Illiteracy’ in terms of effective command of ideas is 
undoubtedly widespread by any standard—but this phenomenon 
should not be confused with unfamiliarity with the symbols of
communication.


BOOK REVIEWS


FRANZ ALEXANDER AND THOMAS MORTON FRENCH. Psychoanalytic
Therapy and Its Implications. New York: The
Ronald Press Co., 1946, pp. 353. $5.00. 


It was something of a puzzle to this reviewer why the
publishers of this book sent it to be reviewed in the JOURNAL OF
EDUCATIONAL PSYCHOLOGY. The book is by two M.D.’s, one of
whom is the director of the Chicago Psychoanalytic Institute and 
the other a member of the staff of this Institute. Heretofore, 
books on psychoanalytic technique have been written for and 
limited to qualified psychoanalysts who have maintained very 
close and exclusive restrictions on their art. I find in the preface 
to this book, however, that it is addressed to psychiatrists, 
psychoanalysts, psychologists, general physicians, social workers, 
and to all whose work is closely concerned with human relation- 
ships. This is a remarkable liberalizing step, credit for which 
must go to the Chicago Institute. In addressing the book in this
way to members of several different professional groups, the
authors are concerned not only that these professions should
understand the principles of psychoanalytic therapy but also that
they should employ them to the degree called for in their
professional work.

This book is noteworthy in that for the first time in psycho- 
analytic circles the procedures laid down by Freud and inflexibly 
employed by his followers are relaxed. Instead of rigidly 
following a uniform procedure it is recommended that the 
procedure be modified to fit the needs of the individual client. 
Emphasis is placed on making a diagnosis of the dynamic struc- 
ture of the individual case and planning a treatment program in 
the light of the dynamic structure of the personality which is 
revealed. This plan itself should be flexible as additional insight
into the case is gained by the therapist. Considerable emphasis
is put on understanding the needs of the case, and the psychotherapeutic
procedures are fitted to those needs. In the past,
psychoanalysts have thought in terms of complete analysis, which
means going back into the infantile origins of present personality
trends. In this book the goal of treatment is a practical one.
The client is helped to function more adequately in his work or 
in his family and the therapy does not attempt to go beyond these
practical needs. Whereas in the orthodox psychoanalysis there 
is insistence that treatment sessions should be held daily, the 
principle is laid down in this book that the treatment sessions 
should be spaced as far apart as is possible while still carrying the
treatment process on at all. Frequent sessions are recommended
only when anxiety is acute, and to the degree that the client can 
take responsibility for his affairs, the treatment sessions will 
be spaced widely. 

This new point of view is to be compared with the theory of 
psychotherapy advanced by Carl Rogers in his book, Counseling 
and Psychotherapy,¹ which has already attracted the widespread
interest and endorsement of American psychologists. Whereas 
Rogers advocates counseling practices more like those employed 
by psychoanalysts (although he expressly denies this) than was 
customary with psychologists and counselors who used didactic 
procedures, the present book proposes psychoanalytic practices 
closer to those which have been employed by lay counselors than 
have customarily been recommended by psychoanalysts. As a
matter of fact, the two positions are now not far apart and they 
differ only in one or two important respects. First of all, Rogers 
does not recognize flexibility in his theory of psychotherapy but 
trains counselors to follow the process of attending to the client’s 
feelings and verbalizing them in rather slavish fashion. In the 
present book, however, the whole emphasis is on flexibility and 
adaptation of the procedure to the needs of the client as evi- 
denced by the dynamics of the personality structure. Rogers 
places little emphasis on diagnosis. It is not necessary in his 
system because every client is handled in the same way. Alex- 
ander and French, however, place considerable emphasis on 
diagnosis which does not depend on standardized psychological 
instruments but rather on the insight which comes from inter- 
preting the meaning of the client’s attitude and events in the 
client’s previous history as revealed in the opening interviews. 
Rogers stresses ignoring content and concentrating the attention 
on the feelings expressed by the client. Alexander and French
would have the therapist pay attention to content in order to
formulate hypotheses regarding the dynamics of the case. However,
they would probably agree with Rogers that their formulation
of the dynamics of the case would not be interpreted
directly to the client and in many respects they would proceed
somewhat as Rogers recommends except that there would be
greater flexibility as would be called for by the individual case.

1 Carl R. Rogers, Counseling and Psychotherapy. Boston: Houghton
Mifflin Company, 1942.

In this reviewer’s opinion, the point of view expressed in this 
book has value not only for counselors but for teachers. Teachers 
are responsible for guiding the personality development of chil- 
dren. They must work through a personal relationship and not 
infrequently they must overcome resistances to learning which 
have developed in the pupil. It is not enough to say that the 
teacher’s task is to educate educable children. The teacher must 
not only do that but help pupils to become educable. This 
reviewer would like to predict that the very psychoanalytic prin- 
ciples which seemed a few years ago to be so esoteric and myster- 
ious and the property of a few initiates into the art will be, not too 
many years distant, a part of the teacher’s art and skill. It is for 
this reason that it is believed that in sending this book to be reviewed
in the JOURNAL OF EDUCATIONAL PSYCHOLOGY the
authors also see the wider implications of the principles of psycho- 
analytic therapy which they have elucidated. 

PERCIVAL M. SYMONDS
Teachers College, Columbia University 


SIDNEY HOOK. Education for Modern Man. New York: The
Dial Press, 1946, pp. 237.


Education for Modern Man is primarily a critical and aggressive 
analysis of two conflicting philosophies of education. That 
educational philosophy whose chief exponent in America is Robert 
M. Hutchins is contrasted with that of John Dewey. An 
evaluation of each philosophy accompanies the writer’s analysis. 

Generic problems which are discussed include: (1) What should 
the aims or ends of education be, and how should we determine 
them? (2) What should its skills and content be, and how can 
they be justified? (3) By what methods and materials can the 
proper educational skills and content be most effectively communicated
in order to achieve the desirable ends? (4) How are
the ends and means of education related to a democratic social
order? These questions are analysed with the gloves off and 
one can almost hear those who agree cheering as they read, and 
those who disagree bristling with rebuttal. 

As Hook draws up the score sheet the major conflict between 
the two philosophies is being fought out in issues such as the 
following: experimental and scientific approaches vs. theological 
and metaphysical approaches, ends of education justified by 
consequence in experience vs. ends measured in terms of ‘culti- 
vation of reason,’ education varying according to needs in a 
particular time and place vs. education that ‘should everywhere 
be the same,’ content of education based on relevance of problems 
considered vs. curriculum organized around the materials of the 
past, and education based on the results of investigation by 
critical method vs. authority of creed—religious, social or political. 
The reviewer has set up the issues in such a way that the pro- 
ponents of experimental and scientific approaches tend to support 
the position first mentioned in each of the other issues listed. 

Although Hook strongly believes the weight of evidence is on 
the side of the position first stated in each case in the preceding 
paragraph, he attempts to give the opposite side in each of the 
issues a fair hearing in presenting the strongest arguments for 
his opponents as he sees the situation. This he does with con- 
siderable clarity, although some would say he is too aggressive 
and dogmatic in arguing his own case. 

A chapter in the book is devoted to Hook’s concept of a good 
teacher. He believes that “the most satisfactory teaching in 
American education is being done on the most elementary levels 
wherever plant facilities are adequate. The least satisfactory 
teaching is being done on the highest levels,” meaning the liberal
arts college. Primary causes for the comparative deficiencies of 
college teaching he believes to be three. The first is failure to 
clarify the function of liberal education, and the dual rôle the
faculty is expected to fill as teachers and research workers. The 
second is the absence of any training in college teaching, indeed 
in any kind of teaching, despite the fact that there are certain 
common psychological and philosophical principles which hold for 
all varieties of instruction. The third is the indifference, almost 
hallowed now by tradition, to pedagogical questions. The 
good teacher has intellectual competence, patience, lesson-planning
ability, knowledge of human beings, sympathy and
vision. Few, perhaps, would object to these criteria, but the 
reemphasizing of them with new illustrations is worth while. 
Some might wish that Hook gave more attention in his illus- 
trative material to education on the elementary and university 
levels. However, he probably believes similar principles apply
regardless of level, and the reviewer is inclined to agree. 
Whether one agrees or disagrees with Hook’s conclusions and 
his strong support of Dewey’s philosophy of education, one must
agree that the book is stimulating, for it is written by a man with 
ideas who can express them with telling force on the printed page. 
This book is well worth reading by those who wish to refresh and 
stimulate their thinking about conflicting philosophies in modern 
education.

RAY H. SIMPSON
University of Alabama


JOSEPH WILFRID TAIT. Some Aspects of the Effect of the Dominant
American Culture upon Children of Italian-Born
Parents. New York: Teachers College, Columbia University,
1942, pp. 74.


This study is based upon seven hundred thirty-four children 
of Italian extraction, eleven to fifteen years of age, who were 
attending five American public schools. The foreign enrollment 
in the schools was one hundred per cent, seventy-five per cent, 
fifty-five per cent, forty per cent, and thirty per cent, respec- 
tively. A control group was employed consisting of three 
hundred sixty children of native American parents and 
grandparents. 

The data suggest that the more Italian children come into 
contact with native American children, the more they experience 
inferiority feeling, and the poorer their social adjustment. As 
the Italian children become older, they tend toward better 
adjustment, although this result may be due to a loss of frankness 
in answering the questionnaires. Boys are more aware than are 
girls of the prejudices which exist. Maladjustment, awareness 
of rejection at the hands of the native American children, and 
unfavorable reaction toward the foreign background are all 
positively correlated, the r’s ranging from .40 to .58. For the 
American children there is a correlation of .63 between the degree
of rejection of foreign children and the lack of social contacts.
The study, evidently a thesis under the sponsorship of the late
Professor Rudolf Pintner, appears to have been carefully made
with adequate statistical controls.

MELVIN G. RIGG
Oklahoma A. & M. College